2 research outputs found

    “And all the pieces matter...” Hybrid Testing Methods for Android App's Privacy Analysis

    Smartphones have become inherent to the everyday life of billions of people worldwide, and they are used for activities such as gaming, interacting with our peers or working. While extremely useful, smartphone apps also have drawbacks, as they can affect the security and privacy of users. Android devices hold a lot of personal data from users, including their social circles (e.g., contacts), usage patterns (e.g., app usage and visited websites) and their physical location. As in most software products, Android apps often include third-party code (Software Development Kits or SDKs) to add functionality to the app without the need to develop it in-house. Android apps and the third-party components embedded in them are often interested in accessing such data, as the online ecosystem is dominated by data-driven business models and revenue streams like advertising. The research community has developed many methods for analyzing the privacy and security risks of mobile apps, mostly relying on two techniques: static code analysis and dynamic runtime analysis. Static analysis examines the code and other resources of an app to detect potential app behaviors. While this makes static analysis easier to scale, it has drawbacks, such as missing app behaviors when developers obfuscate the app's code to avoid scrutiny. Furthermore, since static analysis only shows potential app behavior, its findings need to be confirmed, as it can also report false positives due to dead or legacy code. Dynamic analysis examines apps at runtime to provide actual evidence of their behavior. However, these techniques are harder to scale, as they need to be run on an instrumented device to collect runtime data. Similarly, there is a need to stimulate the app, simulating real inputs to exercise as many code paths as possible. While there are some automatic techniques to generate synthetic inputs, they have been shown to be insufficient.
    In this thesis, we explore the benefits of combining static and dynamic analysis techniques so that they complement each other and reduce their limitations. While most previous work has relied on using these techniques in isolation, we combine their strengths in different and novel ways that allow us to further study different privacy issues in the Android ecosystem. Namely, we demonstrate the potential of combining these complementary methods to study three inter-related issues:

    • A regulatory analysis of parental control apps. We use a novel methodology that relies on easy-to-scale static analysis techniques to pinpoint potential privacy issues and violations of current legislation by Android apps and their embedded SDKs. We rely on the results from our static analysis to inform the way in which we manually exercise the apps, maximizing our ability to obtain real evidence of these misbehaviors. We study 46 publicly available apps and find instances of data collection and sharing without consent and insecure network transmissions containing personal data. We also see that these apps fail to properly disclose these practices in their privacy policy.

    • A security analysis of the unauthorized access to permission-protected data without user consent. We use a novel technique that combines the strengths of static and dynamic analysis, by first comparing the data sent by applications at runtime with the permissions granted to each app in order to find instances of potential unauthorized access to permission-protected data. Once we have discovered the apps that are accessing personal data without permission, we statically analyze their code in order to discover covert- and side-channels used by apps and SDKs to circumvent the permission system. This methodology allows us to discover apps using the MAC address as a surrogate for location data, two SDKs using the external storage as a covert channel to share unique identifiers, and an app using picture metadata to gain unauthorized access to location data.

    • A novel SDK detection methodology that relies on obtaining signals observed both in the app's code and static resources and during its runtime behavior. Then, we rely on a tree structure together with a confidence-based system to accurately detect SDK presence without the need for any a priori knowledge and with the ability to discern whether a given SDK is part of legacy or dead code. We show that this novel methodology can discover third-party SDKs with more accuracy than state-of-the-art tools, both on a set of purpose-built ground-truth apps and on a dataset of 5k publicly available apps.

    With these three case studies, we are able to highlight the benefits of combining static and dynamic analysis techniques for the study of the privacy and security guarantees and risks of Android apps and third-party SDKs. The use of these techniques in isolation would not have allowed us to deeply investigate these privacy issues, as we would lack the ability to provide real evidence of potential breaches of legislation or to pinpoint the specific way in which apps are leveraging covert and side channels to break Android's permission system, and we would be unable to adapt to an ever-changing ecosystem of Android third-party companies.

    The works presented in this thesis were partially funded within the framework of the following projects and grants:

    • European Union's Horizon 2020 Innovation Action program (Grant Agreement No. 786741, SMOOTH Project and Grant Agreement No. 101021377, TRUST AWARE Project).
    • Spanish Government ODIO No. PID2019-111429RB-C21/PID2019-111429RBC22.
    • The Spanish Data Protection Agency (AEPD)
    • AppCensus Inc.

    This work has been supported by IMDEA Networks Institute.

    Doctoral Program in Telematic Engineering at Universidad Carlos III de Madrid.

    President: Srdjan Matic. Secretary: Guillermo Suárez-Tangil. Member: Ben Stoc
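    The first step of the second case study in the abstract above (comparing the data an app sends at runtime with the permissions it was granted) can be illustrated with a minimal sketch. This is not the thesis's tooling: the function name `flag_unauthorized`, the data-type labels, and the small `REQUIRED_PERMISSION` table are hypothetical and chosen only for illustration. Only the Android permission strings themselves are real identifiers.

```python
# Hypothetical table: which Android permission guards each data type.
# A real analysis would need a far more complete mapping.
REQUIRED_PERMISSION = {
    "location": "android.permission.ACCESS_FINE_LOCATION",
    "contacts": "android.permission.READ_CONTACTS",
    "calendar": "android.permission.READ_CALENDAR",
}


def flag_unauthorized(observed, granted):
    """Return the data types seen in an app's traffic whose guarding
    permission is absent from the set of permissions granted to the app."""
    flagged = []
    for data_type in sorted(observed):
        permission = REQUIRED_PERMISSION.get(data_type)
        # Data types without a known guarding permission are skipped.
        if permission is not None and permission not in granted:
            flagged.append(data_type)
    return flagged


if __name__ == "__main__":
    observed = {"location", "contacts"}             # seen in network traffic
    granted = {"android.permission.READ_CONTACTS"}  # granted to the app
    print(flag_unauthorized(observed, granted))
```

    An app flagged this way is only a candidate: as the abstract notes, the code of such apps is then analyzed statically to find the covert or side channel actually used.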

    Study on privacy of parental control mobile applications

    Parental control applications are a kind of mobile software used by parents to monitor and control how their kids use their cellphones. Parents install these types of apps on their children's phones in order to remotely set rules for what the children can do with their device and to monitor where the phone is and what their kids are using it for. These types of applications are highly intrusive, because they gather all kinds of data that reflect private information about the users, such as their Internet history, text messages, calls and location. Furthermore, these data are often sent to servers hosted on the Internet, where the information is gathered in order to let the parent access it later from a device other than their child's phone. Therefore, these systems make it possible (willingly or not) for third parties to access all or part of these private data. From a privacy standpoint, these applications pose a great threat to users. However, there has not yet been a study on the amount and kind of information that these apps gather and the security of their communications. In this thesis, we study how parental control apps behave and the features that they provide to the user, and we conduct several studies to understand their privacy implications. First, we research how well these applications explain their behavior and operation to their users. Second, we gather information about the permissions that these apps request and compare them to their behavior. Third, we study what information is gathered and later sent to Internet servers by these programs. Finally, we investigate whether this information is sent securely. We studied fourteen different parental control applications, but we could perform all of the studies explained above on only seven of them.
    In each one of those seven apps we find at least one of these privacy issues: the sending of sensitive information to third parties, the leakage of private data from before the parental control app installation, the sending of private information before the user agrees to the terms of use (in some cases the user never agrees to them at all), or the sending of private data via an insecure communication channel. We also find that 50% of the fourteen studied applications did not clearly explain to the user that their children's data was being sent through the Internet and stored on servers. Finally, we categorize 15% of the requested permissions in eleven of the fourteen applications as confusing and probably unnecessary.
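    The last check described in the abstract above (whether collected information is sent securely) can be sketched as a simple filter over observed network flows. This is an illustrative assumption about how such a check might look, not the thesis's method: the function name `insecure_flows`, the flow representation as `(url, data_types)` pairs, and the example URLs are all hypothetical, and the sketch only looks at the URL scheme, not at certificate validation or payload encryption.

```python
from urllib.parse import urlparse


def insecure_flows(flows):
    """Given (url, data_types) pairs, return the pairs that carry at least
    one personal data type over a scheme other than HTTPS."""
    return [
        (url, data_types)
        for url, data_types in flows
        if data_types and urlparse(url).scheme != "https"
    ]


if __name__ == "__main__":
    # Hypothetical flows captured while exercising an app.
    flows = [
        ("https://api.example.com/report", ["location"]),
        ("http://tracker.example.net/collect", ["contacts", "location"]),
        ("http://cdn.example.org/logo.png", []),  # no personal data
    ]
    for url, data_types in insecure_flows(flows):
        print(url, data_types)
```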